Retries - ReverseProxy.ErrorHandler based approach #61

Merged 11 commits on Feb 15, 2024

Conversation

nstogner
Contributor

@nstogner nstogner commented Jan 18, 2024

Allow for retrying failed requests to backends. Default to 1 retry per lingo-request.

Fixes #48

Builds on work done by @alpe in #64 and #51.

@nstogner nstogner force-pushed the retries branch 4 times, most recently from cf732f9 to 5f54718 Compare January 18, 2024 18:20
@nstogner nstogner changed the title Proof-of-concept techniques for retrying proxied requests Retries - ReverseProxy.ErrorHandler based approach Jan 23, 2024
@nstogner nstogner changed the title Retries - ReverseProxy.ErrorHandler based approach WIP: Retries - ReverseProxy.ErrorHandler based approach Jan 23, 2024
@nstogner
Contributor Author

@alpe here is a variation of your middleware-based retry approach (#64) that does not require wrapping the http.ResponseWriter. Would love to get your initial thoughts (still have some work to do).
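
A minimal sketch of what an ErrorHandler-based retry could look like, assuming the request body is buffered once and each attempt goes through a fresh `httputil.ReverseProxy`. The `retryProxy` type, its fields, and the round-robin choice of backend are hypothetical and are not taken from this PR. For transport errors the standard library calls `ErrorHandler` before writing anything to the `http.ResponseWriter`, which is why no wrapper around the writer is needed here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// retryProxy is a hypothetical type for illustration, not the PR's handler.
type retryProxy struct {
	backends   []*url.URL // assumed: attempt n goes to backends[n % len(backends)]
	maxRetries int
}

func (p *retryProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// Buffer the body once so that every attempt can replay it.
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "cannot read request body", http.StatusBadRequest)
		return
	}
	p.attempt(w, r, body, 0)
}

func (p *retryProxy) attempt(w http.ResponseWriter, r *http.Request, body []byte, n int) {
	rp := httputil.NewSingleHostReverseProxy(p.backends[n%len(p.backends)])
	// On a transport error nothing has been written to w yet, so it can be reused.
	rp.ErrorHandler = func(w http.ResponseWriter, _ *http.Request, _ error) {
		if n < p.maxRetries {
			p.attempt(w, r, body, n+1)
			return
		}
		http.Error(w, http.StatusText(http.StatusBadGateway), http.StatusBadGateway)
	}
	out := r.Clone(r.Context())
	out.Body = io.NopCloser(bytes.NewReader(body))
	rp.ServeHTTP(w, out)
}

// demo sends one request through a dead backend followed by a live one.
func demo() (int, string) {
	dead := httptest.NewServer(http.NotFoundHandler())
	dead.Close() // connections to this address now fail
	live := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b, _ := io.ReadAll(r.Body)
		fmt.Fprintf(w, "echo:%s", b)
	}))
	defer live.Close()

	deadURL, _ := url.Parse(dead.URL)
	liveURL, _ := url.Parse(live.URL)
	p := &retryProxy{backends: []*url.URL{deadURL, liveURL}, maxRetries: 1}

	rec := httptest.NewRecorder()
	p.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/", strings.NewReader("hello")))
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(demo())
}
```

The sketch reads the whole body up front, which costs memory on every request, including those that never need a retry.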

Contributor

@alpe alpe left a comment

Very nice start and I am happy to see a fresh approach on the retry problem.

func (pr *proxyRequest) httpRequest() *http.Request {
	clone := pr.r.Clone(pr.r.Context())
	clone.Body = io.NopCloser(bytes.NewReader(pr.body))
Contributor

In the scenario with the X-Model header set, this behaviour is not correct: the body is empty and the original Reader is overwritten.
It may also be worth optimizing for the happy path (without a retry) and lazily filling the buffer (with a TeeReader, for example).
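
The lazy-buffer idea could be sketched roughly as follows, assuming the first attempt reads the body through an `io.TeeReader` and a retry replays what was captured plus whatever remains unread. The `lazyBody` type and its method names are hypothetical; nothing here is taken from the PR's code.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// lazyBody is a hypothetical helper that records bytes as they are read.
type lazyBody struct {
	src io.Reader    // the original request body
	buf bytes.Buffer // filled as a side effect of reading src
}

// first returns the reader for the first attempt. Nothing is copied
// unless the backend request actually reads from it.
func (l *lazyBody) first() io.Reader {
	return io.TeeReader(l.src, &l.buf)
}

// replay returns a reader for a retry: the bytes captured so far,
// followed by the part of src that no attempt has read yet.
func (l *lazyBody) replay() io.Reader {
	return io.MultiReader(bytes.NewReader(l.buf.Bytes()), io.TeeReader(l.src, &l.buf))
}

// demo simulates a first attempt that fails after reading five bytes.
func demo() (string, string) {
	l := &lazyBody{src: strings.NewReader("hello world")}
	part := make([]byte, 5)
	io.ReadFull(l.first(), part)
	all, _ := io.ReadAll(l.replay())
	return string(part), string(all)
}

func main() {
	fmt.Println(demo())
}
```

In the happy path only `first` is used, so the extra cost is one copy of the bytes that were streamed anyway.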

Contributor Author

Good catch, I'll look at the TeeReader approach. I think we could also do a nil check on pr.body before resetting clone.Body here as well.

Contributor Author

Took the approach of checking for nil
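
A sketch of what the nil check might look like, reusing the `proxyRequest` and `httpRequest` names from the hunk above. The struct fields beyond `r` and `body`, and the meaning given to a nil `body`, are assumptions. Note that this only protects the first attempt: when the body was never buffered, a retry would still find the original reader already consumed.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// proxyRequest mirrors the names in the reviewed hunk; the layout is assumed.
type proxyRequest struct {
	r    *http.Request
	body []byte // assumed: nil when the body was never read into memory
}

// httpRequest clones the incoming request and swaps in the buffered body
// only when one exists, leaving the original reader alone otherwise.
func (pr *proxyRequest) httpRequest() *http.Request {
	clone := pr.r.Clone(pr.r.Context())
	if pr.body != nil {
		clone.Body = io.NopCloser(bytes.NewReader(pr.body))
	}
	return clone
}

func demo() (string, string) {
	// Case 1: the body was buffered and the original reader is already drained.
	r1 := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(""))
	p1 := &proxyRequest{r: r1, body: []byte(`{"model":"a"}`)}
	b1, _ := io.ReadAll(p1.httpRequest().Body)

	// Case 2: the body was never buffered, so the original reader is used.
	r2 := httptest.NewRequest(http.MethodPost, "/", strings.NewReader("raw"))
	p2 := &proxyRequest{r: r2}
	b2, _ := io.ReadAll(p2.httpRequest().Body)

	return string(b1), string(b2)
}

func main() {
	fmt.Println(demo())
}
```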

Contributor Author

Also added a test case to make sure the right request body was reaching the backend.

Contributor Author

I think this is fixed now. I included a test case for this.

Contributor

@alpe alpe left a comment

Very nice tests! 🏄
It would be good to have the X-Model header scenario covered once it is implemented.

backendCode: http.StatusInternalServerError,
backendBody: `{"err":"oh no!"}`,
expCode: http.StatusBadGateway,
expBody: `{"error":"Bad Gateway"}` + "\n",
Contributor

When retry is enabled, the original backend error message is never passed to the client. This can be a valid design decision, but we should be sensible about which error codes are "retryable". Do you have some scenarios in mind where a 500 can succeed on a retry?

Contributor Author

Good catch, we should probably only filter 500s

Contributor Author

Scratch that, I am going to just pipe through the response in the case where it got an http response by filtering out retry errors.
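
One way to read this is sketched below: `ModifyResponse` turns a backend response into an error only while retries remain and the status is considered retryable; otherwise the backend's status and body are passed through unchanged. The set of retryable codes (502, 503, 504), the `errRetry` sentinel, and the `serve` function are all assumptions made for illustration and may differ from what the PR ended up doing. The sketch uses a GET without a body, so body replay is left out.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

// errRetry is a hypothetical sentinel that routes a response to ErrorHandler.
var errRetry = errors.New("retryable backend status")

// retryable is an assumed policy: only gateway-style failures are retried.
func retryable(code int) bool {
	switch code {
	case http.StatusBadGateway, http.StatusServiceUnavailable, http.StatusGatewayTimeout:
		return true
	}
	return false
}

func serve(w http.ResponseWriter, r *http.Request, target *url.URL, retriesLeft int) {
	rp := httputil.NewSingleHostReverseProxy(target)
	rp.ModifyResponse = func(res *http.Response) error {
		if retriesLeft > 0 && retryable(res.StatusCode) {
			return errRetry // discard this response and go to ErrorHandler
		}
		return nil // pass the backend's status and body through untouched
	}
	rp.ErrorHandler = func(w http.ResponseWriter, _ *http.Request, _ error) {
		if retriesLeft > 0 {
			serve(w, r, target, retriesLeft-1)
			return
		}
		http.Error(w, http.StatusText(http.StatusBadGateway), http.StatusBadGateway)
	}
	rp.ServeHTTP(w, r)
}

func demo() (int, string, int, string) {
	// Backend A always fails with a 500 and a JSON error body.
	a := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusInternalServerError)
		io.WriteString(w, `{"err":"oh no!"}`)
	}))
	defer a.Close()

	// Backend B answers 503 once, then 200.
	var calls int32
	b := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		if atomic.AddInt32(&calls, 1) == 1 {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		io.WriteString(w, "ok")
	}))
	defer b.Close()

	get := func(rawURL string) (int, string) {
		target, _ := url.Parse(rawURL)
		rec := httptest.NewRecorder()
		serve(rec, httptest.NewRequest(http.MethodGet, "/", nil), target, 1)
		return rec.Code, rec.Body.String()
	}
	codeA, bodyA := get(a.URL)
	codeB, bodyB := get(b.URL)
	return codeA, bodyA, codeB, bodyB
}

func main() {
	fmt.Println(demo())
}
```

With this policy the client sees the backend's own 500 and error body, while a transient 503 is hidden behind a successful retry.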

Contributor Author

Updated

@nstogner nstogner changed the title WIP: Retries - ReverseProxy.ErrorHandler based approach Retries - ReverseProxy.ErrorHandler based approach Feb 3, 2024
Contributor

@alpe alpe left a comment

Good progress!
As elaborated on Discord, the X-Model header scenario is not fully handled for retries. The request body should not be empty.

// This point could be reached if a bad response code was sent by the backend
// or
// if there was an issue with the connection and no response was ever received.
if err != nil && pr.attempt < h.MaxRetries {
Contributor

It would be good to abort early when the request context is canceled. I have debugged retries in #65 that never hit a backend

Suggested change
- if err != nil && pr.attempt < h.MaxRetries {
+ if err != nil &&
+     r.Context().Err() == nil &&
+     pr.attempt < h.MaxRetries {

Contributor Author

It looks like the reverse proxy's ServeHTTP function already checks whether the context is done. Do you know which cases would be fixed by also doing the check here?

Contributor

🤔 The result for the end user may be the same, but you get there with fewer steps. Your code would be in charge instead of relying on the stdlib proxy. I stumbled upon this when I was debugging and saw the loop iteration but no backend hit.
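
The suggested condition can be isolated into a small predicate to show its effect. This is only a sketch: `shouldRetry` and its parameters are hypothetical names, with `attempt` and `maxRetries` standing in for `pr.attempt` and `h.MaxRetries` from the hunk.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// shouldRetry reports whether another attempt is worthwhile: the last
// attempt failed, the client is still waiting, and retries remain.
func shouldRetry(ctx context.Context, err error, attempt, maxRetries int) bool {
	return err != nil && ctx.Err() == nil && attempt < maxRetries
}

func demo() (bool, bool, bool) {
	backendErr := errors.New("connection refused")
	live := context.Background()
	canceled, cancel := context.WithCancel(context.Background())
	cancel() // simulates a client that has gone away

	return shouldRetry(live, backendErr, 0, 1), // retry allowed
		shouldRetry(canceled, backendErr, 0, 1), // aborted early
		shouldRetry(live, backendErr, 1, 1) // retries exhausted
}

func main() {
	fmt.Println(demo())
}
```

With the context check in place, a canceled request stops at the predicate instead of starting another proxy round trip that can never reach a backend.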

@nstogner
Contributor Author

nstogner commented Feb 8, 2024

@alpe Added a test case in the handler tests for the failure you mentioned. Seeing it now.

@nstogner nstogner requested a review from alpe February 9, 2024 18:56
Contributor

@samos123 samos123 left a comment

overall LGTM, could you rebase on main so the additional system tests get triggered?

Contributor

@alpe alpe left a comment

Thanks for the updates! The PR looks good now and covers all the scenarios discussed. 👍 Great work

@nstogner nstogner merged commit 48df469 into main Feb 15, 2024
6 checks passed
@nstogner nstogner deleted the retries branch February 15, 2024 17:00
Development

Successfully merging this pull request may close these issues.

Lingo should retry on proxy failure
3 participants